Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Spike] sdk/log: Change FilterProcessor to Filterer #5825

Closed
wants to merge 7 commits into from

Conversation

pellared
Copy link
Member

@pellared pellared commented Sep 17, 2024

Towards open-telemetry/opentelemetry-specification#4207

Based on Proposal B

This is another idea of implementing filtering to the Logs SDK.

I think this design is more in the spirit of the specification and current SDK design. The initial implementation of Enabled was designed when wrapping processors was also needed for implementing log record mutations during processing as log records were passed by value (and not by pointer).

This approach should make adding custom filtering easier for the users in most cases (see the updated ExampleFilterer).

There is a little (mostly CPU) performance overhead caused by looping over filterers in logger.Emit, but I do not find it significant.

goos: linux
goarch: amd64
pkg: go.opentelemetry.io/otel/sdk/log
cpu: Intel(R) Core(TM) i9-10885H CPU @ 2.40GHz
                                  │   old.txt    │                new.txt                │
                                  │    sec/op    │    sec/op      vs base                │
BatchProcessorOnEmit-16             286.6n ±  9%    402.6n ±  9%  +40.43% (p=0.000 n=10)
Processor/Simple-16                 689.4n ±  5%   1018.0n ± 31%  +47.65% (p=0.000 n=10)
Processor/Batch-16                  1.301µ ± 11%    1.643µ ± 21%  +26.29% (p=0.000 n=10)
Processor/SetTimestampSimple-16     680.9n ±  2%    884.5n ±  8%  +29.88% (p=0.000 n=10)
Processor/SetTimestampBatch-16      1.250µ ±  3%    1.597µ ±  8%  +27.76% (p=0.000 n=10)
Processor/AddAttributesSimple-16    784.1n ±  4%    953.4n ±  3%  +21.58% (p=0.000 n=10)
Processor/AddAttributesBatch-16     1.453µ ±  5%    1.712µ ±  5%  +17.79% (p=0.000 n=10)
Processor/SetAttributesSimple-16    743.6n ±  7%    971.8n ± 10%  +30.70% (p=0.000 n=10)
Processor/SetAttributesBatch-16     1.392µ ±  3%    1.409µ ± 33%        ~ (p=0.352 n=10)
LoggerNewRecord/5_attributes-16     288.9n ±  3%    283.3n ±  7%        ~ (p=0.739 n=10)
LoggerNewRecord/10_attributes-16    1.290µ ±  3%    1.253µ ± 13%        ~ (p=0.393 n=10)
LoggerEnabled-16                    7.062n ± 10%    4.029n ±  7%  -42.96% (p=0.000 n=10)
LoggerProviderLogger-16             593.8n ±  3%    626.7n ±  5%   +5.55% (p=0.001 n=10)
WalkAttributes/1_attributes-16      4.665n ±  7%    4.415n ±  3%   -5.35% (p=0.035 n=10)
WalkAttributes/10_attributes-16     4.951n ± 11%    4.311n ±  1%  -12.93% (p=0.000 n=10)
WalkAttributes/100_attributes-16    4.668n ± 10%    4.275n ±  1%   -8.42% (p=0.000 n=10)
WalkAttributes/1000_attributes-16   4.859n ±  4%    4.283n ±  1%  -11.85% (p=0.000 n=10)
SetAddAttributes/SetAttributes-16   169.4n ±  7%    173.4n ±  9%        ~ (p=0.363 n=10)
SetAddAttributes/AddAttributes-16   118.7n ±  1%    121.4n ± 19%   +2.27% (p=0.001 n=10)
SimpleProcessorOnEmit-16            1.764n ± 12%    1.732n ±  8%        ~ (p=0.353 n=10)
geomean                             139.2n          147.6n         +6.03%

                        │    old.txt    │               new.txt                │
                        │      B/s      │     B/s       vs base                │
BatchProcessorOnEmit-16   106.47Mi ± 8%   75.84Mi ± 9%  -28.77% (p=0.000 n=10)

                                  │    old.txt     │                new.txt                │
                                  │      B/op      │     B/op      vs base                 │
BatchProcessorOnEmit-16               0.000 ± 0%       0.000 ± 0%       ~ (p=1.000 n=10) ¹
Processor/Simple-16                   417.0 ± 0%       417.0 ± 0%       ~ (p=1.000 n=10) ¹
Processor/Batch-16                  1.079Ki ± 1%     1.075Ki ± 1%       ~ (p=0.643 n=10)
Processor/SetTimestampSimple-16       417.0 ± 0%       417.0 ± 0%       ~ (p=1.000 n=10) ¹
Processor/SetTimestampBatch-16      1.074Ki ± 1%     1.077Ki ± 1%       ~ (p=0.210 n=10)
Processor/AddAttributesSimple-16      417.0 ± 0%       417.0 ± 0%       ~ (p=1.000 n=10) ¹
Processor/AddAttributesBatch-16     1.063Ki ± 1%     1.075Ki ± 1%       ~ (p=0.117 n=10)
Processor/SetAttributesSimple-16      465.0 ± 0%       465.0 ± 0%       ~ (p=1.000 n=10) ¹
Processor/SetAttributesBatch-16     1.099Ki ± 1%     1.109Ki ± 1%       ~ (p=0.060 n=10)
LoggerNewRecord/5_attributes-16       0.000 ± 0%       0.000 ± 0%       ~ (p=1.000 n=10) ¹
LoggerNewRecord/10_attributes-16      610.0 ± 0%       610.0 ± 0%       ~ (p=1.000 n=10) ¹
LoggerEnabled-16                      0.000 ± 0%       0.000 ± 0%       ~ (p=1.000 n=10) ¹
LoggerProviderLogger-16               345.5 ± 6%       365.0 ± 5%  +5.64% (p=0.007 n=10)
WalkAttributes/1_attributes-16        0.000 ± 0%       0.000 ± 0%       ~ (p=1.000 n=10) ¹
WalkAttributes/10_attributes-16       0.000 ± 0%       0.000 ± 0%       ~ (p=1.000 n=10) ¹
WalkAttributes/100_attributes-16      0.000 ± 0%       0.000 ± 0%       ~ (p=1.000 n=10) ¹
WalkAttributes/1000_attributes-16     0.000 ± 0%       0.000 ± 0%       ~ (p=1.000 n=10) ¹
SetAddAttributes/SetAttributes-16     48.00 ± 0%       48.00 ± 0%       ~ (p=1.000 n=10) ¹
SetAddAttributes/AddAttributes-16     0.000 ± 0%       0.000 ± 0%       ~ (p=1.000 n=10) ¹
SimpleProcessorOnEmit-16              0.000 ± 0%       0.000 ± 0%       ~ (p=1.000 n=10) ¹
geomean                                          ²                 +0.38%                ²
¹ all samples are equal
² summaries must be >0 to compute geomean

                                  │   old.txt    │               new.txt               │
                                  │  allocs/op   │ allocs/op   vs base                 │
BatchProcessorOnEmit-16             0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Processor/Simple-16                 1.000 ± 0%     1.000 ± 0%       ~ (p=1.000 n=10) ¹
Processor/Batch-16                  1.000 ± 0%     1.000 ± 0%       ~ (p=1.000 n=10) ¹
Processor/SetTimestampSimple-16     1.000 ± 0%     1.000 ± 0%       ~ (p=1.000 n=10) ¹
Processor/SetTimestampBatch-16      1.000 ± 0%     1.000 ± 0%       ~ (p=1.000 n=10) ¹
Processor/AddAttributesSimple-16    1.000 ± 0%     1.000 ± 0%       ~ (p=1.000 n=10) ¹
Processor/AddAttributesBatch-16     1.000 ± 0%     1.000 ± 0%       ~ (p=1.000 n=10) ¹
Processor/SetAttributesSimple-16    2.000 ± 0%     2.000 ± 0%       ~ (p=1.000 n=10) ¹
Processor/SetAttributesBatch-16     2.000 ± 0%     2.000 ± 0%       ~ (p=1.000 n=10) ¹
LoggerNewRecord/5_attributes-16     0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
LoggerNewRecord/10_attributes-16    4.000 ± 0%     4.000 ± 0%       ~ (p=1.000 n=10) ¹
LoggerEnabled-16                    0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
LoggerProviderLogger-16             1.000 ± 0%     1.000 ± 0%       ~ (p=1.000 n=10) ¹
WalkAttributes/1_attributes-16      0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
WalkAttributes/10_attributes-16     0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
WalkAttributes/100_attributes-16    0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
WalkAttributes/1000_attributes-16   0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
SetAddAttributes/SetAttributes-16   1.000 ± 0%     1.000 ± 0%       ~ (p=1.000 n=10) ¹
SetAddAttributes/AddAttributes-16   0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
SimpleProcessorOnEmit-16            0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
geomean                                        ²               +0.00%                ²
¹ all samples are equal
² summaries must be >0 to compute geomean

Copy link

codecov bot commented Sep 17, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 84.5%. Comparing base (534ce5a) to head (bbbdc97).
Report is 304 commits behind head on main.

Additional details and impacted files

Impacted file tree graph

@@          Coverage Diff          @@
##            main   #5825   +/-   ##
=====================================
  Coverage   84.5%   84.5%           
=====================================
  Files        272     273    +1     
  Lines      22734   22739    +5     
=====================================
+ Hits       19224   19231    +7     
+ Misses      3167    3165    -2     
  Partials     343     343           

see 5 files with indirect coverage changes

@pellared
Copy link
Member Author

pellared commented Sep 18, 2024

@open-telemetry/go-approvers, PTAL.
WDYT about such proposal?
I am open to any/other suggestions.

Copy link
Member

@XSAM XSAM left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the idea of having a similar design in the trace provider.

But I find this naming, Filterer, a little strange, as I am not sure there is a word for that. Maybe Enabler? 🤔

@@ -30,6 +29,7 @@ const (
type providerConfig struct {
resource *resource.Resource
processors []Processor
filterers []Filterer
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just like the sampler in TracerProvider, could we only have a single instance here? If no one is complaining about the current implementation of the sampler, maybe it is ok to have just one instance.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this approach is simpler and more consistent with how processing works.

For reference, the samplers can be decorators e.g. ParentBased Sampler. There is nothing technically preventing creating a filterer decorator. However, I feel that accepting multiple filterers increases the ease of use.

@pellared
Copy link
Member Author

pellared commented Sep 19, 2024

But I find this naming, Filterer, a little strange, as I am not sure there is a word for that. Maybe Enabler? 🤔

Initially I wanted to name it Enabler. However, it felt to me like it people would think that it is something that enables or disables a logger completely. Where it actually is an abstraction responsible for filtering log records. Moreover, it is used in both Enabled and Emit.

Comment on lines +38 to +39
param.SetSeverity(r.Severity())
if !l.Enabled(ctx, param) {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe it should be possible to make it possible to filter out based on more parameters than passed via Logger.Enabled? E.g. log record body, attributes, etc.

This could be enhanced later without making any breaking change.

@pellared
Copy link
Member Author

pellared commented Dec 18, 2024

Another feedback received during calls is that this proposal is less flexible and makes declaring some pipelines harder.

For example, how to define 2 log record processing pipelines with different filters? For such case the filterer would have to be attached to a processor (not globally for the whole SDK)

Chaining filters that are coupled to emitted logs is also awkward e.g.

// Both processors also implement the Filterer interface.
etw := &ETWLogProcessor{} 
proc := &RateLimitingProcessor{
  Processor: etw,
  Duration: time.Second,
  Size: 1000,
}
provider := log.NewLoggerProvider(
  log.WithProcessor(proc),
  log.WithFilterer(proc),
  log.WithFilterer(etw),
)

It makes decorating processors more bug-prone.

Not mentioning more advanced, nested (tree structure) of processing. Context: open-telemetry/opentelemetry-specification#4407

@pellared pellared closed this Feb 5, 2025
jsuereth pushed a commit to open-telemetry/opentelemetry-specification that referenced this pull request Mar 25, 2025
Fixes
#4363

Towards
#4208
(uses Severity Level passed via Logger.Enabled)

Towards stabilization of OpenTelemetry Go Logs API and SDK.

## Use cases

Below are some use cases where the new functionality can be used:

1. Bridge features like `LogLevelEnabled` in log bridge/appender
implementations. This is needed for **all** (but one) currently
supported log bridges in OTel Go Contrib.
2. Configure a minimum log severity level for a certain log processor.
3. Filter out log and event records when they are inside a span that has
been sampled out (span is valid and has sampled flag of `false`).
4. **Efficiently** support high-performance logging destination like
[Linux user_events](https://docs.kernel.org/trace/user_events.html) and
[ETW (Event Tracing for
Windows)](https://learn.microsoft.com/windows/win32/etw/about-event-tracing).
5. Bridge Logs API to a language-specific logging library (the other way
than usual).

## Changes

Add `Enabled` opt-in operation to the `LogRecordProcessor`.

I created an OTEP first which was a great for having a lot of
discussions and evaluations of different proposals:
-
#4290

Most importantly from
#4290 (comment):

> Among Go SIG we were evaluating a few times an alternative to provide
some new "filter" abstraction which is decoupled from the "processor".
However, we faced more issues than benefits going this route (some if
this is described here, but there were more issues:
open-telemetry/opentelemetry-go#5825 (comment)
. With the current opt-in `Processor.Enabled` we faced less issues so
far.
> We also do not want to replicate all features from the logging
libraries. If someone prefer the log4j (or other) filter design then
someone can always use a bridge and use log4j for filtering. `Enabled`
callback hook is the simplest design (yet very flexible) which makes it
easy to implement in the SDKs. This design is inspired from the design
of the two most popular Go structured logging libraries:
https://pkg.go.dev/log/slog (standard library) and
https://pkg.go.dev/go.uber.org/zap.
> 
> It is worth to adding that Rust design is similar and it also has an
`Enabled` hook. See
#4363 (comment).
Basically we want to add something like
https://docs.rs/log/latest/log/trait.Log.html#tymethod.enabled to the
`LogRecordProcessor` and allow users to implement `Enabled` in the way
that it will meet their requirements.
> 
> I also want to call out form
https://github.com/open-telemetry/opentelemetry-specification/blob/main/oteps/0265-event-vision.md#open-questions:
> 
> > How to support routing logs from the Logs API to a language-specific
logging library
> 
> To support this we would need a log record processor which bridges the
Logs API calls to given logging library. For such case we would need an
`Enabled` hook in `Processor` to efficiently bridge `Logger.Enabled`
calls. A filterer design would not satisfy such use case.

I decided to name the new operation `Enabled` as:
1. this name is already used in logging libraries in many languages:
#4439 (comment)
2. it matches the name of the API call (for all trace, metrics and logs
APIs).

I was also considering `OnEnabled` to have the same pattern as for
`Emit` and `OnEmit`. However, we already have `ForceFlush` and
`Shutdown` which does not follow this pattern so I preferred to keep the
simple `Enabled` name. For `OnEmit` I could also imagine `OnEmitted` (or
`OnEmitting`) which does something after (or just before like we have
`OnEnding` in `SpanProcessor`) `OnEmit` on all registered processors
were called. Yet, I do not imagine something similar for `Enabled` as
calling `Enabled` should not have any side-effects. Therefore, I decided
to name it `Enabled`.

I want to highlight that a processor cannot assume `Enabled` was called
before `OnEmit`, because of the following reasons:

1. **Backward compatibility** – Existing processors may already perform
filtering without relying on `Enabled`. For example: [Add Advanced
Processing to Logs Supplementary Guidelines
#4407](#4407).
2. **Self-sufficiency of `OnEmit`** – Since `Enabled` is optional,
`OnEmit` should be able to handle filtering independently. A processor
filtering events should do so in `OnEmit`, not just in `Enabled`.
3. **Greater flexibility** – Some processors, such as the ETW processor,
don’t benefit from redundant filtering. ETW already filters out events
internally, making an additional check unnecessary.
4. **Performance considerations** – Calling `Enabled` from `OnEmit`
introduces overhead, as it requires converting `OnEmit` parameters to
match `Enabled`'s expected input.
5. **Avoiding fragile assumptions** – Enforcing constraints that the
compiler cannot validate increases the risk of introducing bugs.


This feature is already implemented in OpenTelemetry Go:
- open-telemetry/opentelemetry-go#6317
We have one processor in Contrib which takes advantage of this
functionality:
- https://pkg.go.dev/go.opentelemetry.io/contrib/processors/minsev

This feautre (however with some differences) is also avaiable in OTel
Rust;
#4363 (comment):

> OTel Rust also has this capability. Here's an example where it is
leveraged to improve performance by dropping unwanted log early.
https://github.com/open-telemetry/opentelemetry-rust/blob/88cae2cf7d0ff54a042d281a0df20f096d18bf82/opentelemetry-appender-tracing/benches/logs.rs#L78-L85

---------

Co-authored-by: Tyler Yahn <[email protected]>
Co-authored-by: Sam Xie <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants